chore: update pre-commit hooks by github-actions[bot] · Pull Request #9 · thad0ctor/axolotl

github-actions · 2026-05-01T00:33:26Z

Automated PR to update pre-commit hooks to their latest versions.

…est fixes Seven Minor items from the CodeRabbit full-diff re-scan on commit ``55377e5d``.

**F-#2: Clarify Mode-A guidance in ``protrain_optimizer_wrapper`` 8-bit warning (``api/optim_wrapper.py:802-815``).** The warning told users to set ``protrain_force_all_persistent: true`` to get end-to-end 8-bit AdamW on CPU-resident chunks, but didn't mention that ``protrain_force_all_persistent`` is ignored while ``protrain_auto_mode`` is on (the auto-mode selector picks the mode itself based on capacity). Expanded the warning to instruct users to set ``protrain_auto_mode: false`` AND ``protrain_force_all_persistent: true`` together.

**F-#4: Unify fragmentation-alpha docs in DESIGN.md.** Module summaries at lines 49 (``cost/memory.py``) and 118 (``memory.py`` module spec) still described a fixed ``alpha=1.10`` while Design Decision 1 documents the per-dtype lookup (``ALPHA_FRAGMENTATION_4BIT = 0.75`` for bnb-4-bit). Aligned both summaries to reference the per-dtype helper (``alpha_fragmentation_for_dtype``) and the design decision section.

**F-#5: Resolve ``use_reentrant`` contradiction in DESIGN.md.** Line 109 (``block/checkpoint.py`` module spec) said ``use_reentrant=False``, which matches the actual implementation (verified via ``grep`` against ``block/checkpoint.py:99``). Line 290 (audit Block G analysis) claimed ``use_reentrant=True, the production wrap``, which was stale and incorrect. Updated the analysis text to acknowledge ``use_reentrant=False`` is the production wrap and re-stated the per-block-input residual mechanism in a form compatible with the non-reentrant variant (each CKPT block's saved-tensors-hooks recompute frame holds the block input, which is what produces the linear-in-N_block activation footprint the audit data exposes).

**F-#8: Centralized CUDA-availability guard in ``tests/protrain/test_adamw8bit_adapter.py::_gpu_device``.** The helper unconditionally returned ``torch.device("cuda:0")``, so a custom marker filter or conftest override that lands the module in a CPU-only context would surface as a torch error before any test body. Added a ``pytest.skip("CUDA not available; ...")`` early-return so every gpu-marked test in the module gets a clean skip.

**F-#9: Replace silent ``try/except: pass`` with ``contextlib.suppress(Exception)`` in ``tests/protrain/test_lora_offload_mode.py``.** Five sites (lines 742-746, 839-843, 906-910, 981-985, 1040-1044) each had the same ``for h in handles: try: h.remove() except Exception: pass`` pattern that Ruff S110 flags. Replaced with ``contextlib.suppress(Exception)`` over the loop. Semantics unchanged (best-effort cleanup, tolerate already-removed handles or torch shutting down mid-test); intent now documented by the context manager.

**F-#10: ASCII ``x`` in ``test_lora_offload_mode.py:1062`` docstring.** Missed in the R5 unicode sweep: ``4×3090`` ⇒ ``4x3090``.

**F-#11: ``try/finally`` for ``wrapped.close()`` in 3 sites of ``test_trace_skip_on_override.py``.** ``test_run_trace_skipped_on_override_full_path`` (L255-282), ``test_run_trace_invoked_without_override`` (L319-337), and ``test_partial_overrides_do_not_skip_trace`` (L381-400) each called ``wrapped.close()`` only on the success path, so assertion failures earlier in the test body would skip the close and leak CUDA + chunk resources into subsequent GPU tests. Wrapped each test body in ``try/finally`` so ``wrapped.close()`` always runs. Done programmatically via a one-shot Python rewrite (8 lines of new indent + 2 lines of try/finally per site) to keep the diff mechanical.

### Test gates

- ``pre-commit run --all-files`` ALL green (ruff / ruff-format / mypy / bandit / yaml / eol / whitespace).
- ``tests/protrain/`` default-marker: 313 passed / 4 skipped / 162 deselected / 0 failed.
- GPU sanity on F-touched files (GPU 5): 43 passed / 2 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
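The two cleanup patterns described in F-#9 and F-#11 can be sketched with the standard library alone. This is a minimal illustration, not the code from the commit: ``FakeHandle``, ``FakeWrapped``, ``remove_all`` and ``run_body`` are hypothetical stand-ins for the real hook handles, the wrapped model, and the test bodies. The sketch puts ``contextlib.suppress`` inside the loop so that one failing handle does not prevent the remaining handles from being removed. Wrapping the whole loop instead would stop at the first handle that raises.

```python
import contextlib


class FakeHandle:
    """Hypothetical stand-in for a hook handle whose remove() may raise."""

    def __init__(self, fail=False):
        self.fail = fail
        self.removed = False

    def remove(self):
        if self.fail:
            raise RuntimeError("handle already removed")
        self.removed = True


def remove_all(handles):
    """Best-effort cleanup: attempt every handle, ignore individual failures."""
    for h in handles:
        # suppress() inside the loop: a failure on one handle is swallowed
        # and the loop continues with the next handle.
        with contextlib.suppress(Exception):
            h.remove()


class FakeWrapped:
    """Hypothetical stand-in for an object that owns resources to release."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_body(wrapped, body):
    """Run a test body and always close the wrapped object afterwards."""
    try:
        body(wrapped)
    finally:
        # Runs on success and on failure, so an assertion error in the
        # body cannot leak the resources held by `wrapped`.
        wrapped.close()
```

In a real pytest module the same guarantee is often expressed with a fixture that yields the object and closes it after the ``yield``; the explicit ``try/finally`` form keeps the change local to each test body.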

Fixes pre-commit failures on CI after the ARCH #8/#9/#10 commits: ruff-format auto-format on 8 files (line-wrap of comprehensions and MagicMock(spec=...) calls; alphabetize one multi-import block; strip a trailing blank line in a test header) and add the missing `Any` symbol that `cast("Any", ...)` in test_modec_persistent_partition.py referenced without import.

…rt-plugin auto_memory

Add two config-completeness guards that mirror commit 342e1bd's DDP+zero3 validator pattern (detect known-bad composition at config time, fail or warn loudly with an actionable message).

1. args.py `_guard_lora_mlp_kernel_with_mode_bc` model_validator hard-rejects `lora_mlp_kernel: true` combined with `protrain_force_replicated_cpu_offload: true` or `protrain_zero3_shard: true` (the v61 LoRA_MLPBackward crash is deterministic on Mode-B/C-forced configs) and warns on `protrain_auto_mode: true` (searcher might pick Mode B). Closes proposal §6.qq / §16 PR #10.
2. plugin.py `_maybe_warn_inert_plugin` fires a one-shot LOG.warning from `pre_model_load` when the plugin is listed but `protrain_auto_memory` is falsy; this surfaces the inert-plugin failure mode that produced v15-v52's vanilla-axolotl "measurements". Module-level flag keeps it idempotent. Closes proposal §16 PR #9.

Tests in tests/protrain/test_lora_mlp_kernel_mode_b_validator.py (11 new).
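The two guards described above follow a common shape: reject or warn on a known-bad flag combination at config time, and emit a warning at most once per process. The sketch below illustrates that shape with a plain ``dict`` and the standard ``logging`` module. It is an assumption-laden illustration, not the plugin's code: the real guard is a ``model_validator`` in ``args.py``, and the function names, messages, and logger name here are hypothetical. Only the config keys are taken from the commit message.

```python
import logging

LOG = logging.getLogger("protrain_sketch")  # hypothetical logger name

# Module-level flag: keeps the inert-plugin warning idempotent.
_warned_inert = False


def maybe_warn_inert_plugin(cfg: dict) -> bool:
    """Warn once if the plugin is listed but protrain_auto_memory is falsy.

    Returns True only on the call that actually emitted the warning.
    """
    global _warned_inert
    if _warned_inert or cfg.get("protrain_auto_memory"):
        return False
    _warned_inert = True
    LOG.warning("plugin is listed but protrain_auto_memory is not enabled")
    return True


def guard_lora_mlp_kernel(cfg: dict) -> list[str]:
    """Hard-reject known-bad combinations; return soft warnings otherwise."""
    if not cfg.get("lora_mlp_kernel"):
        return []
    forced_bc = cfg.get("protrain_force_replicated_cpu_offload") or cfg.get(
        "protrain_zero3_shard"
    )
    if forced_bc:
        # Fail at config time with an actionable message.
        raise ValueError(
            "lora_mlp_kernel: true cannot be combined with "
            "protrain_force_replicated_cpu_offload or protrain_zero3_shard"
        )
    warnings = []
    if cfg.get("protrain_auto_mode"):
        # Not a hard error: the mode is only chosen later by the searcher.
        warnings.append("protrain_auto_mode may select a mode that conflicts")
    return warnings
```

Returning the soft warnings as a list instead of logging them directly is a choice made here to keep the sketch easy to test; the commit message only states that the real validator warns.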


thad0ctor · 2026-06-05T23:48:32Z

Closing: fork-only chore that diverges from upstream's pinned hook versions (ruff v0.15.8 / mypy v1.19.1). The mypy 1.x->2.x bump isn't adopted upstream and risks lint drift on rebases. Not needed.

thad0ctor mentioned this pull request May 12, 2026

Phase 2: ProTrain integration with Axolotl perf features (M0–M6C closed) #21

Closed

5 tasks

thad0ctor mentioned this pull request May 23, 2026

ProTrain integration: chunk-managed weight offload + per-dtype memory cost model #24

Closed

5 tasks

chore: update pre-commit hooks

cd44923

github-actions[bot] force-pushed the update/pre-commit-hooks branch from b0fdb3e to cd44923 on June 1, 2026 00:40

Merge upstream/main into update/pre-commit-hooks

1347a64

thad0ctor closed this Jun 5, 2026

thad0ctor deleted the update/pre-commit-hooks branch June 5, 2026 23:48

github-actions[bot] wants to merge 2 commits into main from update/pre-commit-hooks
